Abstract

The Jones and HOMFLY polynomials are link invariants with close connections to quantum computing. It was recently shown that finding a certain approximation to the Jones polynomial of the trace closure of a braid at the fifth root of unity is a complete problem for the one clean qubit complexity class [18]. This is the class of problems solvable in polynomial time on a quantum computer acting on an initial state in which one qubit is pure and the rest are maximally mixed. Here we generalize this result by showing that one clean qubit computers can efficiently approximate the Jones and single-variable HOMFLY polynomials of the trace closure of a braid at any root of unity.

1 Introduction

A knot is an embedding of the circle into three dimensional space. More generally, a link is an embedding of one or more circles into three dimensional space. A link is said to be oriented if one of the two possible orientations is chosen for each circle. Examples are shown in figure 1.

Two links are equivalent if one can be continuously deformed into the other without cutting any strands. One of the most fundamental tasks in the theory of links is to determine whether a given pair of links is equivalent. Although this task appears easy in the simple examples of figure 1, it rapidly becomes difficult for links of many crossings. No polynomial time algorithm for this problem is known. Currently the best upper bound on the complexity of the link equivalence problem is that it is contained in NP [10].

Link invariants are one tool for distinguishing links. A link invariant is some function $f$ on links such that if link $L$ is equivalent to link $L'$ then $f(L) = f(L')$. There may exist inequivalent links that a given link invariant fails to distinguish. The Jones polynomial is an important link invariant that has been very successful in distinguishing inequivalent links. It was discovered in 1985 by Vaughan Jones [12]. For an oriented link $L$ with $m$ crossings, the corresponding Jones polynomial $V_L(t)$ is a linear combination of integer and half-integer powers of $t$. $V_L(t)$ has degree at most $O(m)$, and the coefficients in the polynomial are all integers. That is, $V_L(t) \in \mathbb{Z}[t^{1/2}, t^{-1/2}]$. The coefficients may be exponentially large, and finding their values exactly is known to be #P-complete [11].

In order to formulate computational problems about links, one needs a way to input links into a computer. One way to do this is to use the discrete language of the braid group. A braid of $n$ strands has $n$ pegs across the top and $n$ pegs across the bottom. Each top peg is the starting point of exactly one strand. Each bottom peg is the end point of exactly one strand. On the way, the strands can wind around each other in any arbitrary way, but cannot "double back," as illustrated in figure 2. Two braids are equivalent if one can be deformed into the other without cutting any strands.

Figure 1: Shown from left to right are the unknot, another representation of the unknot, an oriented trefoil knot, and the Hopf link. Broken lines indicate undercrossings.

Figure 2: On the left we have a braid of four strands. The strands must move steadily downwards, thus the object on the right is not a braid.

The set of braids on $n$ strands has the structure of a group. The group operation is concatenation of braids. The $n$-strand braid group $B_n$ is generated by the elementary crossings $\sigma_1, \ldots, \sigma_{n-1}$, where $\sigma_i$ crosses strands $i$ and $i+1$. For example, the braid of figure 2 is $\sigma_1 \sigma_3 \sigma_2$. The topological equivalence of braids is completely captured by the following two relations among the group generators.

$$\sigma_i \sigma_j = \sigma_j \sigma_i \quad \text{for } |i-j| \ge 2$$
$$\sigma_{i+1} \sigma_i \sigma_{i+1} = \sigma_i \sigma_{i+1} \sigma_i \quad \text{for all } i \qquad (1)$$



By joining the free ends of a braid, one can construct a link. Figure 3 illustrates two ways of doing this: the plat closure and the trace closure. Alexander's theorem states that any link can be obtained as the trace closure of some braid. The same is true of the plat closure [18].

In addition to gaining a convenient way for inputting links into computers, by thinking of links in terms of the braid group, we gain an algebraic point of view on the topological problem of distinguishing links. Jones originally formulated his polynomial in terms of certain representations of the braid group [12]. This original representation-theoretic formulation is also convenient for use in quantum computation. We'll now describe it.

Let $b$ be a braid of $n$ strands and let $b^{\mathrm{tr}}$ be the link obtained by taking its trace closure. If each strand of the braid is oriented downward, then an oriented link $L$ results from taking the trace closure. The Jones polynomial of $L$ at $t = e^{i2\pi/k}$ is

$$V_L(e^{i2\pi/k}) = (-ie^{i\pi/2k})^{3w(L)} \, (-2\cos(\pi/k))^{n-1} \, \widetilde{\mathrm{Tr}}(\rho_{n,k}(b)) \qquad (2)$$

Figure 3: Shown from left to right are a braid, its plat closure, and its trace closure.



where $w(L)$ is the "writhe" of $L$. Each crossing of oriented strands is considered positive or negative according to its handedness. $w(L)$ is equal to the number of positive crossings minus the number of negative crossings in $L$. $\rho_{n,k}$ is the path model representation of the braid group $B_n$. $\widetilde{\mathrm{Tr}}$ is a certain weighted trace known as the Markov trace. It is clear that the prefactor $(-ie^{i\pi/2k})^{3w(L)} (-2\cos(\pi/k))^{n-1}$ is easy to calculate, thus the problem of evaluating Jones polynomials reduces in polynomial time to the evaluation of the Markov trace of the path model representation.

In 1989, Witten proved that the Jones polynomial arises as a Wilson loop in Chern-Simons theory, thereby uncovering a connection between topological quantum field theory and knot invariants [19]. In 2002, Freedman et al. showed that quantum computers can efficiently simulate certain topological quantum field theories [8], and furthermore that the problem of simulating these topological quantum field theories is BQP-complete [5]. The results of Freedman et al. combined with that of Witten imply that quantum computers can efficiently estimate the Jones polynomial of the plat closure of a braid at $t = e^{i2\pi/5}$ and furthermore that this problem is BQP-complete. Aharonov et al. subsequently generalized this result, showing that quantum computers can efficiently estimate the Jones polynomial of the plat or trace closure of a braid at $t = e^{i2\pi/k}$
for any k [5] . In [TJ [7J HOI [9] , the problem of estimating the Jones polynomial of the plat closure of a braid 
was shown to be BQP-complete for each $k$ other than 1, 2, 3, 4, and 6. The problem of estimating the Jones polynomial of the trace closure of a braid at $t = e^{i2\pi/5}$ was shown in [18] to be complete for the one clean qubit complexity class, called DQC1.

Whereas the Jones polynomial of the trace closure of a braid is proportional to the Markov trace of its path model representation, the Jones polynomial of the plat closure of a braid is proportional to a certain matrix element of its path model representation. For $t = e^{i2\pi/k}$, the path model representation is unitary. The dimension of the representation is in general exponential in $n$. Thus the direct classical algorithm for calculating the representation of a braid by multiplying the matrices representing individual crossings requires exponential time. In contrast, a quantum circuit on $n$ qubits corresponds to an element of $U(2^n)$. By the path model representation, a braid on $n$ strands corresponds to an exponentially large unitary matrix, which in turn corresponds to a quantum circuit on $\mathrm{poly}(n)$ qubits. The nontrivial achievement of [8], [3], [20], [18] is to show that the number of gates in the quantum circuit need only grow polynomially with the number of crossings in the braid.

Such a correspondence between braids and quantum circuits forms the core of the completeness proofs for Jones polynomial problems. Estimating a matrix element of a quantum circuit to polynomial precision is BQP-complete, and estimating the normalized trace of a quantum circuit to polynomial precision is DQC1-complete. Constructing the correspondence between braids and circuits is slightly more involved in the case of DQC1-completeness, essentially because the circuit can only use logarithmically many ancilla qubits [18].

The approximations to Jones polynomials obtained by quantum computers are additive. The Markov trace $\widetilde{\mathrm{Tr}}(\rho_{n,k}(b))$ has magnitude at most one. The quantum algorithm for approximating the trace closure produces an estimate $e$ satisfying $|e - \widetilde{\mathrm{Tr}}(\rho_{n,k}(b))| \le \epsilon$ with probability $1 - \delta$ in $\mathrm{poly}(1/\epsilon, \log(1/\delta))$ time. It is important to distinguish this from the other common type of approximation known as a Fully Polynomial Randomized Approximation Scheme (FPRAS). An FPRAS for a function $f$ produces an estimate $e$ satisfying $(1 - \epsilon) f \le e \le (1 + \epsilon) f$ with probability $1 - \delta$ in time $\mathrm{poly}(1/\epsilon, \log(1/\delta))$. For many braids $b \in B_n$, $|\widetilde{\mathrm{Tr}}(\rho_{n,k}(b))|$ is exponentially small compared to one. For these instances, an FPRAS is exponentially more precise than a polynomial additive approximation.

The discovery of the Jones polynomial broke open a new field. A number of new and powerful knot invariants related to the Jones polynomial were soon discovered. The HOMFLY polynomial (see footnote 1) is one of these. Like the Jones polynomial, the HOMFLY polynomial is an invariant of oriented links. In general the HOMFLY polynomial is a polynomial in two variables, $H_L(t,x) \in \mathbb{Z}[t, t^{-1}, x, x^{-1}]$. An important special case is the single-variable HOMFLY polynomial, also known as the $sl_r$ invariant. As discussed in the appendix, the Jones polynomial is equivalent to the $r = 2$ special case of the single-variable HOMFLY polynomial. In [20], Wocjan and Yard showed that quantum computers can efficiently approximate single-variable HOMFLY polynomials at arbitrary roots of unity. The HOMFLY polynomial is in turn a special case of an extremely general combinatorial object called the Tutte polynomial. Aharonov et al. have obtained efficient quantum algorithms for approximating Tutte polynomials [2]. It is not yet fully known for what range of parameters the approximation obtained in [2] is BQP-hard.

The one clean qubit model was introduced in [13] as an idealized model of quantum computation on highly mixed states. For example, the states manipulated in NMR experiments are typically highly mixed. One clean qubit computers are believed to be less powerful than standard quantum computers but still capable of solving some problems outside of P. In the one clean qubit model one is given an initial state consisting of one qubit in the pure state $|0\rangle$ and $n$ qubits in the maximally mixed state. In other words, the initial density matrix is

$$\rho = |0\rangle\langle 0| \otimes \frac{I}{2^n},$$

where $I$ is the $2^n \times 2^n$ identity matrix. One is then allowed to apply polynomially many quantum gates to this state, and then do a single-qubit measurement in the computational basis. This procedure can be repeated polynomially many times, each time starting with the same initial state $\rho$. The set of decision problems solvable by this procedure is called DQC1.

Here we show that one clean qubit computers can efficiently estimate Jones and HOMFLY polynomials at arbitrary roots of unity, generalizing the result of [18]. To do this we need only two facts about one clean qubit computers. First, one clean qubit computers can efficiently estimate the normalized trace of quantum circuits to polynomial precision (see footnote 2). That is, we are given a classical description of a quantum circuit on $n$ qubits with $\mathrm{poly}(n)$ gates. This quantum circuit implements some unitary transformation $U$ on a $2^n$-dimensional Hilbert space. The quantity $\mathrm{Tr}[U]/2^n$ is a complex number of magnitude at most one. One clean qubit computers can produce an estimate $e$ of $\mathrm{Tr}[U]/2^n$ such that with probability $1 - \delta$,

$$\left| e - \frac{\mathrm{Tr}[U]}{2^n} \right| < \epsilon$$

in time $\mathrm{poly}(1/\epsilon, \log(1/\delta))$. Second, a computer with one clean qubit can simulate a computer with $O(\log n)$
clean qubits with polynomial overhead. Both of these facts are discussed thoroughly in [Jj5]. For additional 
information about one clean qubit computers we refer the interested reader to [13j [TH [16j HZl HI HB1 E] ■ 
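
The first fact can be illustrated with a small classical simulation. The sketch below assumes the standard Hadamard-test style circuit (Hadamard on the clean qubit, controlled-$U$ on the mixed register, Hadamard again); the text above does not spell out the circuit, so this construction and all function names are illustrative assumptions. Under that assumption the probability of measuring 0 on the clean qubit is $(1 + \mathrm{Re}\,\mathrm{Tr}[U]/2^n)/2$, which gives the real part of the normalized trace.

```python
import math

def kron(a, b):
    """Kronecker product of two square matrices given as nested lists."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]

def rotation(theta):
    """2x2 real rotation matrix, used here as a simple single-qubit unitary."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def prob_zero_one_clean_qubit(u):
    """Probability of outcome 0 on the clean qubit.

    Assumed circuit (not taken from the text): H on the clean qubit,
    controlled-U on the register, H again. The maximally mixed register
    is treated as a uniform mixture over basis states |x>.
    """
    dim = len(u)
    total = 0.0
    for x in range(dim):
        # The |0> branch of the clean qubit carries (|x> + U|x>) / 2.
        branch = [(u[row][x] + (1.0 if row == x else 0.0)) / 2 for row in range(dim)]
        total += sum(abs(amp) ** 2 for amp in branch) / dim
    return total

def normalized_trace(u):
    """Tr[U] / dim, computed directly for comparison."""
    return sum(u[i][i] for i in range(len(u))) / len(u)

u = kron(rotation(0.3), rotation(1.1))   # a 2-qubit unitary
p0 = prob_zero_one_clean_qubit(u)
estimate = 2 * p0 - 1                    # real part of Tr[U] / 2^n
```

On real hardware $p_0$ would itself be estimated from repeated runs, which is where the $\mathrm{poly}(1/\epsilon, \log(1/\delta))$ cost comes from; the simulation computes it exactly.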

2 Path Model Representation of $B_n$

As discussed in section 1 and reference [3], the problem of estimating the Jones polynomial of the trace closure of a braid reduces to the problem of estimating the Markov trace of the braid's path model representation.



1 The name HOMFLY stands for the names of the discoverers of this invariant: Hoste, Ocneanu, Millett, Freyd, Lickorish, and Yetter. Some authors prefer the name HOMFLYPT polynomial to recognize the contributions of Przytycki and Traczyk. We will use the term HOMFLY polynomial simply because it is more widespread.

2 Although we do not need this fact here, it is interesting to note that the decision version of this problem is DQC1-complete.



In this section we present the path model representation of the braid group $B_n$, and the Markov trace of this representation.

Let $\Omega_{n,k}$ be the set of paths of $n$ steps on a ladder of $k - 1$ rungs that start at the bottom. For example, $\Omega_{4,4}$ consists of the four-step paths on a ladder of three rungs. Let $V_{n,k}$ be the formal span of $\Omega_{n,k}$ and let $V^*_{n,k}$ be its dual. For example, $V_{4,4}$ is the span of the paths in $\Omega_{4,4}$, where for any $p, q \in \Omega_{n,k}$:

$$\langle p | q \rangle = \delta_{p,q}$$
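
The definition of $\Omega_{n,k}$ can be made concrete by brute-force enumeration. The following is a minimal sketch; the representation of a path as a tuple of +1 (up) and -1 (down) steps, and the function names, are choices made here for illustration rather than notation from the text.

```python
def paths(n, k):
    """All n-step paths on rungs 1..k-1 that start at rung 1.

    Each path is a tuple of steps, +1 for up and -1 for down.
    """
    result = []

    def extend(prefix, rung):
        if len(prefix) == n:
            result.append(tuple(prefix))
            return
        for step in (+1, -1):
            nxt = rung + step
            if 1 <= nxt <= k - 1:      # stay on the ladder
                extend(prefix + [step], nxt)

    extend([], 1)
    return result

def end_rung(path):
    """Rung on which a path ends, given that it starts at rung 1."""
    return 1 + sum(path)

omega_4_4 = paths(4, 4)
```

Grouping the output by `end_rung` gives the subsets of paths ending on a fixed rung, which is how the representation is later decomposed.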

For any $n, k \in \mathbb{N}$ the path model representation $\rho_{n,k}$ is a homomorphism from $B_n$, the $n$-strand braid group, to $U(V_{n,k})$, the group of unitary transformations on $V_{n,k}$.
Let $\sigma_i$ denote the crossing of strands $i$ and $i + 1$.

$\rho_{n,k}(\sigma_i)$ acts only on steps $i$ and $i + 1$ of paths in $\Omega_{n,k}$, leaving the other steps unchanged. Specifically

step i step i + 1 



Pn,k{o~i) 




= CLl 



And similarly, 
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where: 

A = ie' l7I l 2k \i = sinf^ 
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e; - A- 1 fi = A 



ai = A~ l +A^ q = A^+A- 



x, ^ - - i - A , 



4 v^i+i-^i-i 



These rules completely define the representation $\rho_{n,k}$.

Let $\Omega_{n,k,h}$ be the set of paths in $\Omega_{n,k}$ that end on rung $h$. Let $V_{n,k,h}$ be the corresponding $|\Omega_{n,k,h}|$-dimensional vector space. $\rho_{n,k}(\sigma_i)$ leaves $h$ unchanged for all $i$, as one can see from the preceding rules. Thus, for each $h$, these rules define a representation $\rho_{n,k,h} : B_n \to U(V_{n,k,h})$. $\rho_{n,k}$ is the direct sum of these:

$$\rho_{n,k}(\sigma_i) = \bigoplus_{h=1}^{k-1} \rho_{n,k,h}(\sigma_i)$$



The Markov trace of the representation $\rho_{n,k}$ is given by:

$$\widetilde{\mathrm{Tr}}(\rho_{n,k}(b)) = \frac{1}{\sum_{h=1}^{k-1} \lambda_h |\Omega_{n,k,h}|} \sum_{h=1}^{k-1} \lambda_h \, \mathrm{Tr}[\rho_{n,k,h}(b)] \qquad (3)$$

where $\mathrm{Tr}$ is the ordinary matrix trace, and $\lambda_h = \sin(\pi h / k)$.
In section 4 we show how to estimate the normalized trace

$$\frac{1}{|\Omega_{n,k,h}|} \mathrm{Tr}[\rho_{n,k,h}(b)] \qquad (4)$$



on a one clean qubit computer for each $h$. Given the ability to do this, it is a simple matter to obtain the full Markov trace. By equation 3 we see that we can obtain the Markov trace by classically sampling $h$ according to the distribution

$$p(h) = \frac{\lambda_h |\Omega_{n,k,h}|}{\sum_{h'=1}^{k-1} \lambda_{h'} |\Omega_{n,k,h'}|}.$$

For each $h$ obtained by sampling from this distribution, we use a one clean qubit computer to estimate the corresponding normalized trace of equation 4. By construction, the average obtained by this sampling procedure will converge to the Markov trace. By taking polynomially many samples, one can obtain the Markov trace to polynomial precision. The probability distribution $p(h)$ is easy to sample from because $h$ can take on only $k - 1$ different values, and each $p(h)$ is furthermore easy to compute.
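
As a rough illustration of this classical post-processing, the sketch below computes $|\Omega_{n,k,h}|$ by dynamic programming, builds the distribution $p(h)$, and combines per-$h$ normalized traces into the Markov trace. It takes the exact weighted sum instead of sampling, which is equivalent in expectation; the function names and the dictionary interface are assumptions made for this example.

```python
import math

def count_paths_by_end(n, k):
    """counts[h] = number of n-step paths from rung 1 to rung h on rungs 1..k-1."""
    counts = {h: 0 for h in range(1, k)}
    counts[1] = 1
    for _ in range(n):
        new = {h: 0 for h in range(1, k)}
        for h, c in counts.items():
            for nxt in (h - 1, h + 1):
                if 1 <= nxt <= k - 1:
                    new[nxt] += c
        counts = new
    return counts

def sampling_distribution(n, k):
    """p(h) proportional to sin(pi h / k) * |Omega_{n,k,h}|."""
    counts = count_paths_by_end(n, k)
    weights = {h: math.sin(math.pi * h / k) * c for h, c in counts.items()}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}

def markov_trace(n, k, normalized_traces):
    """Weighted average of the per-h normalized traces (equation 4) under p(h).

    Rungs that no path can reach have p(h) = 0 and are skipped.
    """
    p = sampling_distribution(n, k)
    return sum(p[h] * normalized_traces[h] for h in p if p[h] > 0)

counts = count_paths_by_end(4, 5)
p = sampling_distribution(4, 5)
# For the identity braid every normalized trace equals 1.
identity_value = markov_trace(4, 5, {1: 1.0, 3: 1.0})
```

In the algorithm itself each entry of `normalized_traces` would be an estimate returned by the one clean qubit computer, so sampling $h$ avoids spending quantum runs on rungs that carry little weight.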

To estimate the normalized trace (eq. 4) on a one clean qubit computer, we introduce an encoding $\eta_h$ from bits to paths

$$\eta_h : \{0,1\}^{n\beta} \to \Omega_{n,k,h},$$

where $\beta$ is a parameter whose value we determine in section 3. $\eta_h$ is a non-injective map. However, the number of different bitstrings that map to a given path is approximately the same for all paths. That is, $|\eta_h^{-1}(\omega)| \simeq 2^{n\beta} / |\Omega_{n,k,h}|$ for all $\omega \in \Omega_{n,k,h}$, where $|\eta_h^{-1}(\omega)|$ is the number of bitstrings in $\{0,1\}^{n\beta}$ that get mapped to $\omega$.

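For any $b \in B_n$ with $\mathrm{poly}(n)$ crossings we obtain a quantum circuit $U_{pm}(b)$ of $\mathrm{poly}(n)$ gates such that for almost all $x, y \in \{0,1\}^{n\beta}$,

$$\langle x | U_{pm}(b) | y \rangle \simeq \langle \eta_h(x) | \rho_{n,k,h}(b) | \eta_h(y) \rangle. \qquad (5)$$

In other words, this quantum circuit implements the path model representation of braid $b$. Thus

$$\frac{1}{2^{n\beta}} \mathrm{Tr}[U_{pm}(b)] = \frac{1}{2^{n\beta}} \sum_{x \in \{0,1\}^{n\beta}} \langle x | U_{pm}(b) | x \rangle$$
$$\simeq \frac{1}{2^{n\beta}} \sum_{x \in \{0,1\}^{n\beta}} \langle \eta_h(x) | \rho_{n,k,h}(b) | \eta_h(x) \rangle$$
$$\simeq \frac{1}{|\Omega_{n,k,h}|} \sum_{\omega \in \Omega_{n,k,h}} \langle \omega | \rho_{n,k,h}(b) | \omega \rangle$$
$$= \frac{1}{|\Omega_{n,k,h}|} \mathrm{Tr}[\rho_{n,k,h}(b)].$$

With such an encoding we are able to use $n\beta$ maximally mixed qubits to obtain a uniformly weighted trace over all paths in $\Omega_{n,k,h}$.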

3 Encoding Paths as Bitstrings 

To describe $\eta_h$ we imagine a randomized classical algorithm which uses $n\beta$ random bits to produce an element of $\Omega_{n,k,h}$ approximately uniformly at random. Such an algorithm corresponds to a map from $\{0,1\}^{n\beta}$ to $\Omega_{n,k,h}$, and the fact that the probability distribution over paths is approximately uniform ensures that

$$|\eta_h^{-1}(\omega)| \simeq \frac{2^{n\beta}}{|\Omega_{n,k,h}|}$$

for all $\omega \in \Omega_{n,k,h}$. This algorithm is not to be run, but is rather a conceptual tool for the design of the encoding $\eta_h$.

The algorithm works by starting at the bottom rung, and adding steps one by one until a path of $n$ steps is obtained. Let $Q^k_n(a, a')$ be the number of paths of $n$ steps on a ladder of $k - 1$ rungs which start at rung $a$ and end at rung $a'$. Suppose the current path has $t$ steps and ends at rung $a$. Once step $t + 1$ has been chosen, $n - t - 1$ steps remain, so there are $Q^k_{n-t-1}(a + 1, h)$ completions of this path in which step $t + 1$ is upward and $Q^k_{n-t-1}(a - 1, h)$ completions in which step $t + 1$ is downward. Thus the algorithm chooses step $t + 1$ to be upward with probability

$$p_{up}(a,t) = \frac{Q^k_{n-t-1}(a+1, h)}{Q^k_{n-t-1}(a+1, h) + Q^k_{n-t-1}(a-1, h)}. \qquad (6)$$

(To cover the cases $a = 1$ and $a = k - 1$ we define $Q^k_m(0, h) = Q^k_m(k, h) = 0$.) By choosing each step according to equation 6 one obtains at the end a uniform distribution over $\Omega_{n,k,h}$. This is illustrated in figure 4.
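
The claim that this step rule yields an exactly uniform distribution can be checked with exact rational arithmetic. The sketch below assumes the convention that a path with $t$ steps already chosen has $n - t - 1$ steps left after step $t + 1$; the function names are hypothetical and chosen for this example.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product

def make_counter(k):
    """Return q(steps, a, b): number of `steps`-step paths from rung a to rung b."""
    @lru_cache(maxsize=None)
    def q(steps, a, b):
        if not 1 <= a <= k - 1:
            return 0                    # convention covering a = 0 and a = k
        if steps == 0:
            return 1 if a == b else 0
        return q(steps - 1, a + 1, b) + q(steps - 1, a - 1, b)
    return q

def p_up(q, n, h, a, t):
    """Probability that step t+1 goes up, given t steps taken and current rung a."""
    up = q(n - t - 1, a + 1, h)
    down = q(n - t - 1, a - 1, h)
    return Fraction(up, up + down)

def path_probability(q, n, h, path):
    """Probability that the idealized algorithm outputs `path` (tuple of +1/-1)."""
    rung, prob = 1, Fraction(1)
    for t, step in enumerate(path):
        pu = p_up(q, n, h, rung, t)
        prob *= pu if step == 1 else 1 - pu
        rung += step
    return prob

def paths_ending_at(n, k, h):
    """Brute-force list of valid n-step paths from rung 1 to rung h."""
    found = []
    for steps in product((1, -1), repeat=n):
        rung, ok = 1, True
        for s in steps:
            rung += s
            if not 1 <= rung <= k - 1:
                ok = False
                break
        if ok and rung == h:
            found.append(steps)
    return found

n, k, h = 6, 5, 3
q = make_counter(k)
probs = [path_probability(q, n, h, p) for p in paths_ending_at(n, k, h)]
```

The product of step probabilities telescopes, because the denominator at each step is the number of completions from the current rung, so every path receives probability $1 / Q^k_n(1, h)$.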
To generate a path of $n$ steps, we use $n$ registers of $\beta$ random bits. We think of the registers as encoding numbers $r_1, \ldots, r_n$ in the range $0, 1, \ldots, 2^\beta - 1$. The $t$-th step is chosen to be up if and only if

$$r_t < \lfloor p_{up}(a,t) \, 2^\beta \rceil. \qquad (7)$$

Note that if the path has reached the top or bottom rung, $p_{up}$ can equal 0 or 1, respectively.
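
The encoding from registers to paths can be sketched directly from this rule. The code below assumes the cutoff is the nearest integer to $2^\beta p_{up}$ (the rounding mode is an assumption drawn from the later remark about rounding to the nearest integer) and indexes the cutoff by the number of steps already taken; all names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from functools import lru_cache
from itertools import product
import math

def make_cutoff(n, k, h, beta):
    """Return cutoff(a, t): rounded 2^beta * p_up for current rung a after t steps."""
    @lru_cache(maxsize=None)
    def q(steps, a):
        # Number of `steps`-step paths from rung a to rung h on rungs 1..k-1.
        if not 1 <= a <= k - 1:
            return 0
        if steps == 0:
            return 1 if a == h else 0
        return q(steps - 1, a + 1) + q(steps - 1, a - 1)

    def cutoff(a, t):
        up, down = q(n - t - 1, a + 1), q(n - t - 1, a - 1)
        p_up = Fraction(up, up + down)
        # Assumed rounding mode: nearest integer, halves rounded up.
        return math.floor(p_up * 2 ** beta + Fraction(1, 2))

    return cutoff

def eta(registers, cutoff):
    """Map n register values (each in 0..2^beta - 1) to a path of +1/-1 steps."""
    rung, path = 1, []
    for t, r in enumerate(registers):
        step = 1 if r < cutoff(rung, t) else -1
        path.append(step)
        rung += step
    return tuple(path)

n, k, h, beta = 4, 5, 3, 3
cutoff = make_cutoff(n, k, h, beta)
preimage_sizes = Counter(
    eta(regs, cutoff) for regs in product(range(2 ** beta), repeat=n)
)
```

With these small parameters the three paths ending on rung 3 receive 1280, 1280, and 1536 of the 4096 register settings, against an ideal share of about 1365 each, which shows the non-uniformity that rounding introduces.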

If $p_{up}(a,t)$ were implemented exactly then the paths would be produced with exactly uniform probability. Because of the rounding shown in equation 7, each $p_{up}(a,t)$ is only accurate to within $\pm 2^{-\beta}$. Correspondingly, the number of bitstrings that get mapped by $\eta_h$ to a given path is not precisely the same for all paths. This introduces an error into the estimate of the normalized trace given by

$$E_{round} = \left| \frac{1}{|\Omega_{n,k,h}|} \sum_{\omega \in \Omega_{n,k,h}} \langle \omega | \rho_{n,k,h}(b) | \omega \rangle - \frac{1}{2^{n\beta}} \sum_{x \in \{0,1\}^{n\beta}} \langle \eta_h(x) | \rho_{n,k,h}(b) | \eta_h(x) \rangle \right|.$$

Figure 4: Here the transition probabilities are illustrated in the randomized algorithm for producing paths of three steps on a ladder of four rungs. Using the rule of equation 6 the final probabilities come out uniform.


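By the definition of $\eta_h$ this is

$$E_{round} = \left| \sum_{\omega \in \Omega_{n,k,h}} p_{uni}(\omega) \langle \omega | \rho_{n,k,h}(b) | \omega \rangle - \sum_{\omega \in \Omega_{n,k,h}} p(\omega) \langle \omega | \rho_{n,k,h}(b) | \omega \rangle \right|$$

where $p(\omega)$ is the distribution over paths produced by the classical algorithm using $\beta$ bits of precision, and

$$p_{uni}(\omega) = \frac{1}{|\Omega_{n,k,h}|}$$

is the uniform distribution. By the triangle inequality,

$$E_{round} \le \sum_{\omega \in \Omega_{n,k,h}} \left| (p_{uni}(\omega) - p(\omega)) \langle \omega | \rho_{n,k,h}(b) | \omega \rangle \right|.$$

Because $\rho_{n,k,h}$ is unitary this gives us

$$E_{round} \le \sum_{\omega \in \Omega_{n,k,h}} |p_{uni}(\omega) - p(\omega)| = \| p_{uni} - p \|_1. \qquad (8)$$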

Here we are thinking of probability distributions as vectors and measuring their distance using the 1-norm. For the purpose of estimating Jones polynomials in DQC1, one wants to estimate the normalized trace to polynomial precision. Thus it suffices to have

$$\| p_{uni} - p \|_1 \le \frac{1}{\mathrm{poly}(n)}. \qquad (9)$$

As proven below, to satisfy condition 9 it is sufficient to implement each $p_{up}$ to polynomial precision. Thus it is sufficient to choose $\beta = O(\log n)$.
Let

$$\Omega_{\le n,k} = \bigcup_{t=0}^{n} \Omega_{t,k}.$$

As shown in figure 4, our classical probabilistic algorithm can be thought of as a Markov process on $\Omega_{\le n,k}$. Each element in $\Omega_{t,k}$ probabilistically transitions to one of two possible elements in $\Omega_{t+1,k}$ with probabilities $p_{up}(a,t)$ and $1 - p_{up}(a,t)$. Hence we can define a $|\Omega_{\le n,k}|$-dimensional stochastic matrix $M$ representing our idealized algorithm. Each row contains at most two nonzero entries, which are $p_{up}(a - 1, t - 1)$ and $1 - p_{up}(a + 1, t - 1)$. The initial probability distribution $p_0$ on $\Omega_{\le n,k}$ has probability one on the zero-step path.

After choosing $t$ steps, the probability distribution is

$$p_t = M^t p_0$$

which has support only on paths of $t$ steps. We define $\tilde{M}$ analogously to $M$, except that instead of $p_{up}(a,t)$ and $1 - p_{up}(a,t)$ the entries represent the actual transition probabilities obtained using $\beta$ bits of precision. Thus, in each row, $M$ and $\tilde{M}$ have at most two nonzero entries (at the same places) and these entries differ by at most $\epsilon = 2^{-\beta}$. Thus, in each row, $\Delta = (\tilde{M} - M)$ has only two nonzero entries, each of magnitude bounded by $\epsilon$. Hence for any probability distribution $p$,

$$\| \Delta p \|_1 \le 2\epsilon. \qquad (10)$$



Let $\tilde{p}_t$ be the probability distribution obtained on $t$-step paths by the actual algorithm and let $p_t$ be that obtained by the idealized algorithm. Further, let $E_t = \| \tilde{p}_t - p_t \|_1$. So:

$$\tilde{p}_t = \tilde{M}^t p_0$$

$$E_{t+1} = \| \tilde{p}_{t+1} - p_{t+1} \|_1 = \| \tilde{M} \tilde{p}_t - M p_t \|_1 = \| \tilde{M} \tilde{p}_t - M \tilde{p}_t + M \tilde{p}_t - M p_t \|_1$$

and by the triangle inequality

$$E_{t+1} \le \| M \tilde{p}_t - M p_t \|_1 + \| \tilde{M} \tilde{p}_t - M \tilde{p}_t \|_1 = \| M (\tilde{p}_t - p_t) \|_1 + \| (\tilde{M} - M) \tilde{p}_t \|_1.$$

$M$ is a stochastic matrix and therefore $\| M x \|_1 \le \| x \|_1$ for any $x$. Thus

$$E_{t+1} \le \| \tilde{p}_t - p_t \|_1 + \| (\tilde{M} - M) \tilde{p}_t \|_1 = E_t + \| (\tilde{M} - M) \tilde{p}_t \|_1 \le E_t + 2\epsilon$$

by equation 10. Since $E_0 = 0$, the final error is bounded by

$$E_n \le 2n\epsilon = 2n 2^{-\beta}.$$

$E_n$ is exactly the expression $\| p - p_{uni} \|_1$ appearing in equation 8, thus choosing $\beta = O(\log n)$ suffices to make $E_{round}$ polynomially small.
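
The bound $2n 2^{-\beta}$ can be checked numerically for small cases. The sketch below computes the exact distribution over paths induced by rounded cutoffs and its 1-norm distance from uniform; nearest-integer rounding and the function name are assumptions made for this example.

```python
from fractions import Fraction
from functools import lru_cache
import math

def rounding_error(n, k, h, beta):
    """1-norm distance between the rounded path distribution and the uniform one."""
    @lru_cache(maxsize=None)
    def q(steps, a):
        # Number of `steps`-step paths from rung a to rung h on rungs 1..k-1.
        if not 1 <= a <= k - 1:
            return 0
        if steps == 0:
            return 1 if a == h else 0
        return q(steps - 1, a + 1) + q(steps - 1, a - 1)

    scale = 2 ** beta
    dist = {}

    def walk(t, rung, prob, path):
        if prob == 0:
            return                      # branch unreachable after rounding
        if t == n:
            dist[path] = prob
            return
        up, down = q(n - t - 1, rung + 1), q(n - t - 1, rung - 1)
        # Assumed rounding mode: nearest integer, halves rounded up.
        c = math.floor(Fraction(up, up + down) * scale + Fraction(1, 2))
        walk(t + 1, rung + 1, prob * Fraction(c, scale), path + (1,))
        walk(t + 1, rung - 1, prob * Fraction(scale - c, scale), path + (-1,))

    walk(0, 1, Fraction(1), ())
    total = q(n, 1)
    uniform = Fraction(1, total)
    missing = total - len(dist)         # paths that rounding gave probability zero
    return sum(abs(p - uniform) for p in dist.values()) + missing * uniform

small_case = rounding_error(4, 5, 3, 3)
errors = {beta: rounding_error(6, 5, 3, beta) for beta in (3, 6, 10)}
```

Each extra bit of precision roughly halves the bound, so $\beta$ on the order of $\log n$ plus the logarithm of the desired inverse precision is enough.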



4 Algorithm for Jones Polynomials 



With the encoding $\eta_h$ in place, the remaining task is to efficiently implement $U_{pm}(b)$ with a quantum circuit, as described in equation 5. To do this, it suffices to efficiently implement $U_{pm}(\sigma_t)$ for each crossing $\sigma_t$. Then, to represent any $m$-crossing braid $\sigma_{t_1} \sigma_{t_2} \cdots \sigma_{t_m}$ we can concatenate the corresponding quantum circuits to obtain $U_{pm}(\sigma_{t_1}) U_{pm}(\sigma_{t_2}) \cdots U_{pm}(\sigma_{t_m})$.

As discussed in section 2, $\rho_{n,k,h}(\sigma_t)$ transforms only steps $t$ and $t + 1$ in any path. Hence $U_{pm}(\sigma_t)$ transforms only registers $t$ and $t + 1$ of $\beta$ qubits each in any encoded path. However, the transformation on these two steps depends on the rung $l$ on which they start. Each register encodes whether a given step is up or down. Thus, $l$ is encoded in the preceding $t - 1$ registers. The number of rungs is fixed at $k - 1$, thus only a constant number of ancilla qubits ($\lceil \log_2(k - 1) \rceil$) are needed to store $l$. As discussed in section 1, up to logarithmically many clean ancilla qubits can be simulated on a one clean qubit computer. We will now describe how to efficiently compute $l$ into a register of $O(1)$ clean ancilla qubits using reversible computation. (We assume $k$ is constant, unlike $n$.)

To do this, we start by precomputing the cutoffs $\lfloor 2^\beta p_{up}(a,t) \rceil$ for all $1 \le a \le k - 1$ and $0 \le t < n$ on a standard classical computer. To store these numbers requires $nk\beta$ bits, which for any fixed $k$ is of order $n \log n$. By equation 6 we can compute these cutoffs by counting the number of paths of given length $m < n$ that begin and end on given rungs.

Let $A$ be the adjacency matrix of the line graph of $k - 1$ nodes

1 - 2 - 3 - ... - (k-1)

Then, the number of paths of length $s$ from rung $a$ to rung $h$ is the $a, h$ matrix element of $A^s$. This can clearly be computed in $\mathrm{poly}(s, k)$ time.
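
The path-counting step can be sketched with plain integer matrix arithmetic. The example below builds the adjacency matrix of the line graph and raises it to the power $s$ by repeated squaring; the helper names are choices made for this illustration.

```python
def adjacency(k):
    """Adjacency matrix of the line graph on nodes 1..k-1."""
    m = k - 1
    return [[1 if abs(i - j) == 1 else 0 for j in range(m)] for i in range(m)]

def mat_mul(x, y):
    """Product of two square integer matrices given as nested lists."""
    size = len(x)
    return [
        [sum(x[i][l] * y[l][j] for l in range(size)) for j in range(size)]
        for i in range(size)
    ]

def mat_pow(a, s):
    """a raised to the non-negative integer power s, by repeated squaring."""
    size = len(a)
    result = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    base = a
    while s > 0:
        if s & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        s >>= 1
    return result

def count_paths(s, a, h, k):
    """Number of s-step paths from rung a to rung h on a ladder of k-1 rungs."""
    return mat_pow(adjacency(k), s)[a - 1][h - 1]
```

Python integers have arbitrary precision, so the exponentially large counts that appear for large $s$ are handled exactly.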

Suppose we have one register of $c = \lceil \log_2(k - 1) \rceil$ qubits containing $l_i$, the rung of step $i$, and one register of $c$ qubits initialized to zero in which we wish to write $l_{i+1}$. $l_{i+1}$ is simply set to $l_i + 1$ or $l_i - 1$ depending on whether or not the $(i + 1)$-th $\beta$-qubit register in the encoding contains a number less than the corresponding cutoff $\lfloor 2^\beta p_{up}(l_i, i) \rceil$. Comparing two numbers to see which is bigger can be done reversibly using logarithmically many ancillas, and the same is true for adding or subtracting 1 to a number. (See section 4 of [18] for a summary of the literature on reversible arithmetic with limited ancillas.) Since the cutoffs are hardcoded they do not need to be computed reversibly at all. Thus this whole process is doable in DQC1. One then uncomputes $l_i$ and repeats this process until $l_t$ is obtained for the desired $t$. Thus starting with $l_0 = 1$, one can efficiently produce a register of qubits containing $l_t$.

To implement $U_{pm}(\sigma_t)$ we first compute $l_t$ and then use a unitary $U_\sigma$ that acts on three registers: the $t$-th and $(t + 1)$-th registers of $\beta$ qubits from the encoding and the register containing $l_t$. $U_\sigma$ does not affect the $l_t$ register. Rather, $l_t$ controls what operation gets applied to the other two registers, as specified by the path model representation. Thus, after applying $U_\sigma$, one can uncompute $l_t$.

$U_\sigma$ acts on $2\beta + \lceil \log_2(k - 1) \rceil = O(\log n)$ qubits. By general techniques it is possible to implement arbitrary unitary transformations on $O(\log n)$ qubits using $\mathrm{poly}(n)$ gates [15]. Thus efficiency is not a concern. We just need to construct a concrete unitary implementation of $U_\sigma$ that gives the correct transformation on the encoded paths as specified by the path model representation.

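If the encoded path takes two steps in the same direction at steps $t$ and $t + 1$ (up-up or down-down), then, by the path model representation, we merely need to apply a phase shift of $A^{-1}$. If the encoded path instead forms a peak (up then down) or a valley (down then up) at these steps, then we must unitarily transform to some linear combination of the encodings of these two paths.

There are $2^{2\beta}$ possible values for the bits contained in the relevant two registers. Suppose that the number of these bitstrings that encode the peak is equal to the number of bitstrings that encode the valley. (As we shall see, these two numbers are equal up to rounding.) Let's call this number $d$. Then, we can use the labels $|\mathrm{peak}, 1\rangle, \ldots, |\mathrm{peak}, d\rangle$ for the bitstrings that encode the peak, and $|\mathrm{valley}, 1\rangle, \ldots, |\mathrm{valley}, d\rangle$ for the bitstrings that encode the valley.

Therefore, for each $j \in \{1, \ldots, d\}$, $U_\sigma$ acts on the pair $|\mathrm{peak}, j\rangle$, $|\mathrm{valley}, j\rangle$ with the same $2 \times 2$ matrix that the path model representation applies to the peak and the valley starting at rung $l$ (equation 11), in accordance with section 2. This is unitary and satisfies equation 5.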

Looking in more detail at $\eta_h$ we can specify the labelling concretely. We can think of the contents of the $t$-th and $(t + 1)$-th registers as specifying two numbers $r_t, r_{t+1} \in \{0, 1, \ldots, 2^\beta - 1\}$. Correspondingly we have the cutoffs

$$C^l_t = \lfloor 2^\beta p_{up}(l, t) \rceil$$
$$C^{l+1}_{t+1} = \lfloor 2^\beta p_{up}(l + 1, t + 1) \rceil$$
$$C^{l-1}_{t+1} = \lfloor 2^\beta p_{up}(l - 1, t + 1) \rceil$$

The bitstrings with $r_t < C^l_t$, $r_{t+1} \ge C^{l+1}_{t+1}$ encode the peak (up then down), and the bitstrings with $r_t \ge C^l_t$, $r_{t+1} < C^{l-1}_{t+1}$ encode the valley (down then up). By the definition of $p_{up}(a,t)$, the probability of hopping up and then down is the same as the probability of hopping down and then up. This is because both processes end on the same rung, thus each of these paths has the same number of completions ending at height $h$ after the remaining steps. Hence, up to rounding, the number of bitstrings from the $2^{2\beta}$ possibilities that encode the peak is the same as the number of bitstrings that encode the valley. We can choose the label $j$ from equation 11 to be:



for 
for 



-ii+i 



o 



j = C l t X[n + (r t+ i 

] = C l t -\{rt-C l t )+r t+1 



(12) 



Because of rounding to the nearest integer, $p_{up}(a,t)$ is only calculated to accuracy $\pm 2^{-\beta}$. Thus, the number of bitstrings encoding the peak (up then down) can exceed the number of bitstrings encoding the valley (down then up) by as much as $\sim 2^\beta$ or vice versa. To achieve unitarity we define $U_\sigma$ to act as the identity on these excess bitstrings. We therefore refer to these strings as "stuck".

We will now show that, because these stuck bitstrings form at most a $\sim 2^{-\beta}$ fraction of all the $2^{2\beta}$ encodings on which $U_\sigma$ acts, the error introduced by them is negligible, provided $\beta = \Omega(\log n)$. We can divide the set of bitstrings $\{0,1\}^{n\beta}$ into those that are stuck and those that are unstuck. By unitarity, no $U_{pm}(\sigma_i)$ operator can ever transform an unstuck string into a stuck string or vice versa.

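The total error introduced by the stuck strings is

$$E_{stuck} = \left| \frac{1}{2^{n\beta}} \sum_{x \in \{0,1\}^{n\beta}} \left[ \langle x | U_{pm}(b) | x \rangle - \langle \eta_h(x) | \rho_{n,k,h}(b) | \eta_h(x) \rangle \right] \right|.$$

For unstuck strings, $\langle x | U_{pm}(b) | x \rangle = \langle \eta_h(x) | \rho_{n,k,h}(b) | \eta_h(x) \rangle$, thus

$$E_{stuck} = \left| \frac{1}{2^{n\beta}} \sum_{x \in S} \left[ \langle x | U_{pm}(b) | x \rangle - \langle \eta_h(x) | \rho_{n,k,h}(b) | \eta_h(x) \rangle \right] \right|$$

where $S$ is the set of stuck strings. By the triangle inequality

$$E_{stuck} \le \frac{1}{2^{n\beta}} \sum_{x \in S} \left( |\langle x | U_{pm}(b) | x \rangle| + |\langle \eta_h(x) | \rho_{n,k,h}(b) | \eta_h(x) \rangle| \right).$$

By unitarity these matrix elements have at most unit magnitude, so

$$E_{stuck} \le \frac{1}{2^{n\beta}} \sum_{x \in S} 2.$$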

Thus $E_{stuck}$ is at most twice the fraction of strings in $\{0,1\}^{n\beta}$ that are stuck. For each $i = 1, 2, \ldots, n - 1$, the pair of registers $i$ and $(i + 1)$ has probability approximately $2^{-\beta}$ of being stuck. Thus the fraction of bitstrings in which at least one pair is stuck is approximately

$$1 - (1 - 2^{-\beta})^n.$$

By choosing $\beta = \Omega(\log n)$ we can thus ensure that $E_{stuck}$ is polynomially small.
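
A quick numerical sketch shows how large $\beta$ must be in practice. It uses the approximation above for the stuck fraction together with the elementary bound $1 - (1 - x)^n \le nx$; the target value and function names are assumptions for illustration.

```python
import math

def stuck_fraction(n, beta):
    """Approximate fraction of bitstrings with at least one stuck register pair."""
    return 1.0 - (1.0 - 2.0 ** -beta) ** n

def beta_for_target(n, target):
    """Smallest beta with n * 2^-beta <= target, which bounds the stuck fraction."""
    return max(0, math.ceil(math.log2(n / target)))

n = 100
beta = beta_for_target(n, 0.01)
fraction = stuck_fraction(n, beta)
```

Because the required $\beta$ grows only as $\log_2(n / \text{target})$, an inverse-polynomial target keeps $\beta$ logarithmic in $n$.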
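The total error $E$ in the estimate of the normalized trace is

$$E \le E_{round} + E_{stuck}.$$

We can see this as follows.

$$E = \left| \frac{1}{|\Omega_{n,k,h}|} \sum_{\omega \in \Omega_{n,k,h}} \langle \omega | \rho_{n,k,h}(b) | \omega \rangle - \frac{1}{2^{n\beta}} \sum_{x \in \{0,1\}^{n\beta}} \langle x | U_{pm}(b) | x \rangle \right|$$

$$= \left| \frac{1}{|\Omega_{n,k,h}|} \sum_{\omega \in \Omega_{n,k,h}} \langle \omega | \rho_{n,k,h}(b) | \omega \rangle - \frac{1}{2^{n\beta}} \sum_{x \in \{0,1\}^{n\beta}} \langle \eta_h(x) | \rho_{n,k,h}(b) | \eta_h(x) \rangle + \frac{1}{2^{n\beta}} \sum_{x \in \{0,1\}^{n\beta}} \langle \eta_h(x) | \rho_{n,k,h}(b) | \eta_h(x) \rangle - \frac{1}{2^{n\beta}} \sum_{x \in \{0,1\}^{n\beta}} \langle x | U_{pm}(b) | x \rangle \right|$$

(We have added and subtracted $\frac{1}{2^{n\beta}} \sum_{x \in \{0,1\}^{n\beta}} \langle \eta_h(x) | \rho_{n,k,h}(b) | \eta_h(x) \rangle$, leaving the total unchanged.) Applying the triangle inequality we obtain

$$E \le \left| \frac{1}{|\Omega_{n,k,h}|} \sum_{\omega \in \Omega_{n,k,h}} \langle \omega | \rho_{n,k,h}(b) | \omega \rangle - \frac{1}{2^{n\beta}} \sum_{x \in \{0,1\}^{n\beta}} \langle \eta_h(x) | \rho_{n,k,h}(b) | \eta_h(x) \rangle \right| + \left| \frac{1}{2^{n\beta}} \sum_{x \in \{0,1\}^{n\beta}} \langle \eta_h(x) | \rho_{n,k,h}(b) | \eta_h(x) \rangle - \frac{1}{2^{n\beta}} \sum_{x \in \{0,1\}^{n\beta}} \langle x | U_{pm}(b) | x \rangle \right|$$

The first term is recognizable as $E_{round}$, and the second term is recognizable as $E_{stuck}$, thus we are done.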
Now that we know how to estimate the normalized trace,

$$\frac{1}{|\Omega_{n,k,h}|} \mathrm{Tr}[\rho_{n,k,h}(b)]$$

for each $h$, we can do weighted classical sampling over $h$ to obtain the Markov trace, as described in section 2. Lastly, in accordance with equation 2 we multiply by the easily computed prefactor

$$(-ie^{i\pi/2k})^{3w(L)} \, (-2\cos(\pi/k))^{n-1}$$

to obtain an estimate of the Jones polynomial.






Figure 5: The Young diagrams with four boxes. They correspond to the partitions of the number four. 



5 HOMFLY polynomials 

As discussed in section 1, the Jones polynomial is equivalent to a special case of a more general knot invariant called the single-variable HOMFLY polynomial. Let $L$ be the trace closure of a braid $b \in B_n$. To make $L$ an oriented link, every strand of the braid is oriented downward. The single-variable HOMFLY polynomial is

4V 27r/fc ) = ( t m ^T ~ le "' (r+1)e(feWfc ^ (7rn ' fc,r(6)) (13) 

where $\pi_{n,k,r}$ is the Jones-Wenzl representation of $B_n$, $\widetilde{\mathrm{Tr}}$ indicates its Markov trace (to be defined shortly), and $e(b)$ is the sum of the exponents appearing in $b$ when written in terms of the generators $\sigma_1, \ldots, \sigma_{n-1}$. Thus, $e(b)$ is minus the writhe of $L$. For each $n$ and $k$, the Jones-Wenzl representation is a unitary representation of the group $B_n$, whose dimension is exponential in $n$. In section 6 we will describe how to efficiently implement this unitary representation with quantum circuits, thereby allowing the efficient estimation of single-variable HOMFLY polynomials using one clean qubit. In the present section we will first describe the Jones-Wenzl representation and its Markov trace. Our presentation closely follows that of [20].

The Jones-Wenzl representation of $B_n$, the braid group on $n$ strands, is formulated in terms of standard Young tableaux of $n$ boxes. For any $n$, the Young diagrams are all the possible partitions of $n$ boxes into rows, where the rows are arranged in non-increasing order of length. All the Young diagrams for $n = 4$ are illustrated in figure [5]. For a given Young diagram $\lambda$, the corresponding standard Young tableaux are all the numberings of the boxes such that, if we started with no boxes and added boxes in this order, the configuration would be a valid Young diagram after every step. An example is shown in figure [6].
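As a small illustration of these definitions, the following sketch enumerates the Young diagrams of $n$ boxes and counts the standard Young tableaux of each shape by repeatedly removing the box that carries the largest label. The function names `partitions` and `count_standard_tableaux` are hypothetical helpers introduced here for illustration only; they are not part of the algorithm described in this paper, and no restriction depending on $k$ or $r$ is applied.

```python
from functools import lru_cache


def partitions(n, max_part=None):
    """Yield every partition of n as a tuple of row lengths in non-increasing order."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


@lru_cache(maxsize=None)
def count_standard_tableaux(shape):
    """Count standard Young tableaux of `shape`.

    The box labelled n must sit at the end of some row whose removal leaves a
    valid Young diagram, so we sum over all such removable corners.
    """
    if sum(shape) == 0:
        return 1
    total = 0
    for j, length in enumerate(shape):
        below = shape[j + 1] if j + 1 < len(shape) else 0
        if length > below:  # the last box of row j is a removable corner
            smaller = shape[:j] + (length - 1,) + shape[j + 1:]
            smaller = tuple(x for x in smaller if x > 0)
            total += count_standard_tableaux(smaller)
    return total


if __name__ == "__main__":
    for shape in partitions(4):
        print(shape, count_standard_tableaux(shape))
```

For $n = 4$ the sketch lists five diagrams, matching the number of partitions of four.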

For the reader intrigued by the appearance of Young tableaux we make the following aside. Young tableaux were originally introduced to construct representations of the symmetric group $S_n$ (cf. [5]). $B_n$ is closely related to $S_n$; the latter is obtained from the former by adding the relation $\sigma_i^2 = 1$. Any representation $\rho$ of the symmetric group must satisfy $\rho(\sigma_i)^2 = \mathbb{1}$. By deforming this relation to $\rho(\sigma_i)^2 = (-t^{-3/4} + t^{1/4})\rho(\sigma_i) + t^{-1/2}\mathbb{1}$, we obtain the path model representation of $B_n$. (The correspondence between paths and standard Young tableaux and the relationship between the path model and Jones-Wenzl representations are explained in the appendix.) This type of deformation appears frequently in mathematics and is referred to as a quantum deformation or q-deformation. In the limit $t \to 1$ we recover a representation of $S_n$. The origin of the term quantum deformation is the commutation relation $pq - qp = i\hbar\mathbb{1}$ among the position and momentum operators in quantum mechanics. This is a deformation of the classical commutation relation $pq - qp = 0$.

We now describe in detail the Jones-Wenzl representation of $B_n$. Let $T_{n,k,r}$ be the set of standard Young tableaux of $n$ boxes and at most $r$ rows, such that after every step, the configuration is not only a valid Young diagram, but also has the property that the number of boxes in the first row minus the number of boxes in the $r$th row is at most $k - r$. Let $W_{n,k,r}$ be the formal span of $T_{n,k,r}$. For given $n, k, r$, the Jones-Wenzl representation is a group homomorphism $\pi_{n,k,r} : B_n \to U(W_{n,k,r})$ from the braid group $B_n$ to the group of unitary transformations on the vector space $W_{n,k,r}$.

The elementary crossings $\sigma_1, \ldots, \sigma_{n-1}$ generate the braid group $B_n$. Thus, to specify the representation $\pi_{n,k,r}$ it suffices to specify the representations of these crossings



³ However, for consistency with [18], we use $k$ and $r$ to represent the parameters called $l$ and $k$, respectively, in [20].
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Figure 6: Above we show an example of a standard Young tableau, and beneath it the corresponding sequence of 
Young diagrams. Above each Young diagram is listed the number of boxes in the first row minus the number of boxes 
in the third row. (In some diagrams the number of boxes in the third row is zero). The maximum value taken by 
this difference is three. Thus, the standard Young tableau shown is a member of $T_{n,k,3}$ for $k = 6, 7, 8, \ldots$, but not for $k = 1, 2, 3, 4, 5$.



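$\pi_{n,k,r}(\sigma_1), \ldots, \pi_{n,k,r}(\sigma_{n-1})$, as is done by the following rule. For any $\Lambda \in T_{n,k,r}$,

$$\pi_{n,k,r}(\sigma_i) \Lambda = -e^{i\pi(1 - d_i(\Lambda))/k} \frac{\sin(\pi/k)}{\sin(\pi d_i(\Lambda)/k)} \Lambda - e^{i\pi/k} \sqrt{1 - \frac{\sin^2(\pi/k)}{\sin^2(\pi d_i(\Lambda)/k)}} \, \Lambda', \qquad (14)$$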



where $\Lambda'$ is the Young tableau obtained from $\Lambda$ by swapping boxes $i$ and $i + 1$, and $d_i(\Lambda)$ is the "axial" distance from box $i$ to box $i + 1$ in $\Lambda$. That is, if box $i$ appears in row $r_i(\Lambda)$ and column $c_i(\Lambda)$ and box $i + 1$ appears in row $r_{i+1}(\Lambda)$ and column $c_{i+1}(\Lambda)$ then

$$d_i(\Lambda) = c_i(\Lambda) - c_{i+1}(\Lambda) - (r_i(\Lambda) - r_{i+1}(\Lambda)). \qquad (15)$$

For some $\Lambda \in T_{n,k,r}$, the Young tableau $\Lambda'$ obtained by swapping boxes $i$ and $i + 1$ is not contained in $T_{n,k,r}$. However, one can verify that in such cases, the coefficient $\sqrt{1 - \sin^2(\pi/k)/\sin^2(\pi d_i(\Lambda)/k)}$ is always zero. Thus, equation [14] defines a linear transformation strictly within $W_{n,k,r}$.
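The axial distance and the resulting two-dimensional block of a crossing can be checked numerically. The sketch below is only an illustration: the helper names (`box_positions`, `axial_distance`, `crossing_block`, `is_unitary`) are hypothetical, tableaux are represented as lists of rows of labels, and the overall signs of the two coefficients in `crossing_block` are an assumption made here, since they depend on conventions. The unitarity of the block does not depend on that sign choice.

```python
import cmath
import math


def box_positions(tableau):
    """Map each label to its (row, column), both 1-indexed."""
    pos = {}
    for r, row in enumerate(tableau, start=1):
        for c, label in enumerate(row, start=1):
            pos[label] = (r, c)
    return pos


def axial_distance(tableau, i):
    """d_i = c_i - c_{i+1} - (r_i - r_{i+1}) for boxes i and i+1."""
    pos = box_positions(tableau)
    r_i, c_i = pos[i]
    r_next, c_next = pos[i + 1]
    return c_i - c_next - (r_i - r_next)


def crossing_block(d, k):
    """2x2 block acting on span{Lambda, Lambda'} for axial distance d.

    Assumption: both coefficients carry a leading minus sign.
    Swapping boxes i and i+1 negates the axial distance, hence diag(-d).
    """
    x = math.pi / k

    def diag(dd):
        return -cmath.exp(1j * x * (1 - dd)) * math.sin(x) / math.sin(x * dd)

    ratio = (math.sin(x) / math.sin(x * d)) ** 2
    off = -cmath.exp(1j * x) * math.sqrt(max(0.0, 1.0 - ratio))
    return [[diag(d), off], [off, diag(-d)]]


def is_unitary(m, tol=1e-12):
    """Check that the rows of a 2x2 complex matrix are orthonormal."""
    for a in range(2):
        for b in range(2):
            inner = sum(m[a][c] * m[b][c].conjugate() for c in range(2))
            if abs(inner - (1.0 if a == b else 0.0)) > tol:
                return False
    return True


if __name__ == "__main__":
    tableau = [[1, 2], [3]]
    d = axial_distance(tableau, 2)
    print(d, is_unitary(crossing_block(d, 7)))
```

When boxes $i$ and $i+1$ are adjacent in a row or column the axial distance is $\mp 1$, and the off-diagonal coefficient returned by `crossing_block` vanishes, in line with the remark that the swapped tableau is then not a valid basis element.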

By swapping boxes, one never changes the shape of a Young tableau. Thus, the Jones-Wenzl representation is reducible, with invariant subspaces corresponding to different Young diagrams. The Markov trace is the following weighted sum of the traces over these subspaces.

$$\widetilde{\mathrm{Tr}}(\pi_{n,k,r}(b)) = \sum_{\lambda} S^{(\lambda)}_{k,r} \, \mathrm{Tr}(\pi^{(\lambda)}_{n,k,r}(b)) \qquad (16)$$

where $\pi^{(\lambda)}_{n,k,r}$ is the Jones-Wenzl representation on the subspace corresponding to Young diagram $\lambda$, $\mathrm{Tr}$ denotes the ordinary matrix trace, and the sum is over all Young diagrams of $n$ boxes and at most $r$ rows such that the number of boxes in the top row minus the number of boxes in the $r$th row is at most $k - r$. The weights $S^{(\lambda)}_{k,r}$ are given by

$$S^{(\lambda)}_{k,r} = \left( \frac{\sin(\pi/k)}{\sin(\pi r/k)} \right)^{n} \prod_{(i,j) \in \lambda} \frac{\sin(\pi(j - i + r)/k)}{\sin(\pi h_{i,j}(\lambda)/k)} \qquad (17)$$

where the product is over all (row, column) coordinates in the Young diagram $\lambda$, and $h_{i,j}(\lambda)$ is the "hook length" of the box at row $i$, column $j$. That is, $h_{i,j}(\lambda)$ is the number of boxes to the right of box $(i,j)$ in row $i$ plus the number of boxes below box $(i,j)$ in column $j$, plus 1. This is illustrated in figure [7].
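The weights are straightforward to evaluate classically. The following sketch assumes the weight has the form $S^{(\lambda)}_{k,r} = (\sin(\pi/k)/\sin(\pi r/k))^n \prod_{(i,j)} \sin(\pi(j-i+r)/k)/\sin(\pi h_{i,j}(\lambda)/k)$; the function names `hook_length` and `markov_weight` are hypothetical and introduced only for illustration. Diagrams are given as tuples of row lengths.

```python
import math


def hook_length(shape, i, j):
    """Hook length of the box at row i, column j (both 1-indexed)."""
    arm = shape[i - 1] - j                          # boxes to the right in row i
    leg = sum(1 for row in shape[i:] if row >= j)   # boxes below in column j
    return arm + leg + 1


def markov_weight(shape, k, r):
    """Weight of Young diagram `shape` in the Markov trace (assumed form, see lead-in)."""
    n = sum(shape)
    x = math.pi / k
    weight = (math.sin(x) / math.sin(r * x)) ** n
    for i, row_len in enumerate(shape, start=1):
        for j in range(1, row_len + 1):
            weight *= math.sin(x * (j - i + r)) / math.sin(x * hook_length(shape, i, j))
    return weight


if __name__ == "__main__":
    k, r = 7, 2
    # For three boxes and two rows there is one tableau of shape (3,)
    # and two tableaux of shape (2, 1); the weighted count should be 1.
    total = markov_weight((3,), k, r) + 2 * markov_weight((2, 1), k, r)
    print(total)
```

As a sanity check, the weights multiplied by the number of tableaux of each shape sum to one in these small cases, which is what one expects if the Markov trace of the identity braid is 1.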

6 Algorithm for HOMFLY polynomials 

Because of the close relationship between the path model representation and the Jones-Wenzl representation,
the one clean qubit algorithm for estimating the single-variable HOMFLY polynomial of the trace closure
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p 



Figure 7: In the Young diagram shown above, the hook length of the box at position (2,2) is four. In general the 
hooklength of a box is the number of boxes in the "hook" that includes the box itself, all the boxes to the right of it 
in the same row, and all the boxes below it in the same column. 



of a braid is a fairly direct generalization of the Jones polynomial algorithm of sections [3] and [4]. For any fixed $k$ and $r$ the runtime of the algorithm scales polynomially with $n$. However, we do not have polynomial scaling with $r$.

We need an encoding that maps bitstrings to standard Young tableaux. Let $T^{(\lambda)}_{n,k,r}$ be the set of Young tableaux in $T_{n,k,r}$ compatible with Young diagram $\lambda$. For each $\lambda$ we introduce

$$\nu_\lambda : \{0,1\}^{n\beta} \to T^{(\lambda)}_{n,k,r}.$$

In order to get a uniformly weighted trace, we must construct a $\nu_\lambda$ with the property that

$$|\nu_\lambda^{-1}(\Lambda)| \approx \frac{2^{n\beta}}{|T^{(\lambda)}_{n,k,r}|} \qquad (18)$$

for each $\Lambda \in T^{(\lambda)}_{n,k,r}$. To design a mapping $\nu_\lambda$ satisfying equation [18], we think in terms of a classical randomized algorithm for uniformly sampling from $T^{(\lambda)}_{n,k,r}$ using $n\beta$ random bits. The algorithm works similarly to the algorithm described in section [3] for sampling from the paths $\Omega_{n,k,h}$. The main difference is that at each step in a path, one has at most two choices: step up or step down, whereas at each step in the sequence corresponding to a Young tableau of $r$ rows, one can have as many as $r$ choices: add a box to any row. To ensure a uniform sampling from $T^{(\lambda)}_{n,k,r}$, we must probabilistically make this choice as follows. After choosing the first $t < n$ steps we have a Young tableau $\Lambda_t \in T_{t,k,r}$. Let $R_k(\Lambda_t)$ be the number of Young tableaux in $T^{(\lambda)}_{n,k,r}$ obtainable by starting with $\Lambda_t$ and adding the remaining $n - t$ boxes. Let $\Lambda_t^j$ be the Young tableau obtained from $\Lambda_t$ by adding the next box to row $j$. At each step we must add a box to row $j$ with probability

$$p_j^{(\lambda)}(\Lambda_t) = \frac{R_k(\Lambda_t^j)}{R_k(\Lambda_t)}. \qquad (19)$$

Note that there are two cases where $R_k(\Lambda_t^j) = 0$. The first is when $j = 1$, and by adding this box to the top row we violate the condition that the number of boxes in the top row of $\Lambda_t^j$ minus the number of boxes in the bottom row of $\Lambda_t^j$ must be at most $k - r$. The second case is when $\Lambda_t$ has an equal number of boxes in rows $j$ and $j - 1$. Thus by adding a box to row $j$ we obtain an invalid Young diagram.
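The sampling rule can be sketched classically with exact completion counts. The code below is an illustration under stated assumptions, not the procedure used inside the quantum circuit: `make_counter` and `sample_tableau` are hypothetical names, a partially built tableau is represented only by its tuple of row lengths, and the count of completions is computed by plain memoized recursion. A box is added to row $j$ with probability equal to the number of completions after that choice divided by the number of completions before it.

```python
import random
from functools import lru_cache


def make_counter(target, k, r):
    """Return (R, target) where R(rows) counts the ways to grow `rows` into `target`.

    Growth adds one box at a time and must always leave a valid Young diagram
    whose first row exceeds its r-th row by at most k - r.
    """
    target = tuple(target) + (0,) * (r - len(target))

    def allowed(rows):
        is_diagram = all(rows[j] >= rows[j + 1] for j in range(r - 1))
        inside = all(a <= b for a, b in zip(rows, target))
        return is_diagram and inside and rows[0] - rows[-1] <= k - r

    @lru_cache(maxsize=None)
    def R(rows):
        if not allowed(rows):
            return 0
        if rows == target:
            return 1
        return sum(R(rows[:j] + (rows[j] + 1,) + rows[j + 1:]) for j in range(r))

    return R, target


def sample_tableau(target, k, r, rng=random):
    """Sample a uniformly random restricted tableau of shape `target`.

    The result lists, for boxes 1..n in order, the (1-indexed) row each box joins.
    """
    R, target = make_counter(target, k, r)
    rows = (0,) * r
    if R(rows) == 0:
        raise ValueError("no restricted tableau of this shape exists")
    sequence = []
    while rows != target:
        children = [rows[:j] + (rows[j] + 1,) + rows[j + 1:] for j in range(r)]
        weights = [R(child) for child in children]  # proportional to the probabilities
        j = rng.choices(range(r), weights=weights)[0]
        sequence.append(j + 1)
        rows = children[j]
    return sequence


if __name__ == "__main__":
    print(sample_tableau((3, 2), k=10, r=2))
```

Because each choice is weighted by the number of completions, every tableau of the target shape is produced with the same probability, namely one over the total count.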

To generate a random element of $T^{(\lambda)}_{n,k,r}$ we use $n$ registers of $\beta$ random bits. We think of these registers as encoding numbers $r_1, \ldots, r_n$ in the range $0, 1, \ldots, 2^\beta - 1$. Let $F$ be the cumulative distribution function

$$F_j(\Lambda_t, \lambda) = \sum_{i=1}^{j} p_i^{(\lambda)}(\Lambda_t), \qquad (20)$$

with $F_0(\Lambda_t, \lambda) = 0$. The $(t+1)$th box is added to row $j$ if and only if

$$\lceil F_{j-1}(\Lambda_t, \lambda) 2^\beta \rceil \le r_{t+1} < \lceil F_j(\Lambda_t, \lambda) 2^\beta \rceil.$$
By doing this, we choose which row to add each box to approximately according to equation [19]. By essentially the same argument given in section [4], it suffices to use probabilities $p_j^{(\lambda)}(\Lambda_t)$ accurate to within an inverse polynomial in $n$. Hence, we can again choose $\beta = O(\log n)$.
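To make the rounding concrete, the sketch below turns a list of row probabilities into integer cutoffs and then decodes a register value into a row. The names `cutoffs` and `row_for_register` are hypothetical, and exact rational arithmetic via `fractions.Fraction` is a choice made here for clarity rather than something prescribed by the text.

```python
import math
from fractions import Fraction


def cutoffs(probabilities, beta):
    """Return [ceil(F_0 * 2**beta), ..., ceil(F_r * 2**beta)] with F_0 = 0.

    F_j is the cumulative sum of the first j probabilities.
    """
    cumulative = Fraction(0)
    result = [0]
    for p in probabilities:
        cumulative += Fraction(p)
        result.append(math.ceil(cumulative * 2 ** beta))
    return result


def row_for_register(value, cuts):
    """Return the 1-indexed row j with cuts[j-1] <= value < cuts[j]."""
    for j in range(1, len(cuts)):
        if cuts[j - 1] <= value < cuts[j]:
            return j
    raise ValueError("register value outside the range covered by the cutoffs")


if __name__ == "__main__":
    beta = 3
    cuts = cutoffs([Fraction(1, 3), Fraction(2, 3), 0], beta)
    print(cuts)
    print([row_for_register(v, cuts) for v in range(2 ** beta)])
```

With three bits and probabilities 1/3, 2/3 and 0, three of the eight register values select the first row and five select the second, so each rounded probability is within $2^{-\beta}$ of its target.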

For each $\sigma_i \in B_n$ we show how to efficiently implement a quantum circuit $U_{JW}(\sigma_i)$ such that for almost all $x, y \in \{0,1\}^{n\beta}$,

$$\langle x | U_{JW}(\sigma_i) | y \rangle \approx \langle \nu_\lambda(x) | \pi^{(\lambda)}_{n,k,r}(\sigma_i) | \nu_\lambda(y) \rangle. \qquad (21)$$

By concatenating these circuits, we can efficiently implement the Jones-Wenzl representation of any braid of polynomially many crossings. Then, by using the one clean qubit algorithm for trace estimation, we can approximate the HOMFLY polynomial of the trace closure of the braid.

$\pi^{(\lambda)}_{n,k,r}(\sigma_i)$ transforms only boxes $i$ and $i + 1$. By the definition of $\nu_\lambda$ the location of these two boxes is encoded in the $i$th and $(i+1)$th registers of $\beta$ qubits each. Thus, $U_{JW}(\sigma_i)$ transforms only these two registers. By equation [14] it is apparent that the transformation performed on these two registers depends on the axial distance between the boxes they describe. Less obviously, the transformation depends on the cutoffs

$$\lceil F_j(\Lambda_i, \lambda) 2^\beta \rceil, \quad \lceil F_{j'}(\Lambda_{i+1}, \lambda) 2^\beta \rceil$$

for certain relevant $(j, j')$. This is because these cutoffs determine the encoding $\nu_\lambda$ between Young tableaux and bitstrings.

The axial distance and the cutoffs are encoded in the preceding $(i-1)$ $\beta$-qubit registers. We'll show how to extract the relevant information into logarithmically many ancilla qubits, so that the transformation $U_{JW}(\sigma_i)$ can be implemented by a quantum circuit acting on only logarithmically many qubits. By the general construction of [15], any unitary on logarithmically many qubits can be implemented using polynomially many quantum gates.

Rather than directly computing cutoffs and axial distances, we'll work in terms of other quantities which are easier to extract from the first $(i-1)$ registers. Recall that a Young tableau can be thought of as a sequence of steps by which to build a final Young diagram, adding one box at a time. Let $b_j(t)$ be the number of boxes in row $j$ after $t$ steps. $b_1(t), b_2(t), \ldots, b_r(t)$ completely describe the Young diagram of step $t$. We can do a change of variables, defining

$$
\begin{aligned}
c_1(t) &= b_1(t) + b_2(t) + \ldots + b_r(t) = t \\
c_2(t) &= b_1(t) - b_2(t) \\
c_3(t) &= b_2(t) - b_3(t) \\
&\ \ \vdots \\
c_r(t) &= b_{r-1}(t) - b_r(t).
\end{aligned}
$$

The $(r-1)$-tuple

$$\vec{c}(t) = (c_2(t), c_3(t), \ldots, c_r(t))$$

defines the "profile" of the Young tableau, as illustrated in figure [8]. These profiles are higher dimensional analogues to the rungs in the path model. The restriction to $k$ rungs is here replaced with the restriction to profiles in which $c_2 + c_3 + \ldots + c_r \le k - r$. The Jones-Wenzl representation acts on the space of Young tableaux which correspond to walks on these profiles, just as the path model representation acts on the space of paths which correspond to walks on the rungs.

We'll next show how to extract $\vec{c}(i-1)$ into $O((r-1)\log n)$ clean ancilla qubits. Once we do this, we can implement $U_{JW}(\sigma_i)$ because its action on the $i$th and $(i+1)$th registers is completely determined by $\vec{c}(i-1)$. In order to compute $\vec{c}(i-1)$ we need to know the cutoffs $\lceil F_j(\Lambda_t, \lambda) 2^\beta \rceil$ for all $t < i - 1$ and all relevant $j$. The key thing to notice about $\lceil F_j(\Lambda_t, \lambda) 2^\beta \rceil$ is that it depends only on $t$, $j$, $\lambda$ and the profile of $\Lambda_t$, not on any of $\Lambda_t$'s internal details. As a result, for any fixed⁴ $r$, $k$, and $\lambda$, there are only polynomially many



⁴ As described at the end of this section, we classically sample over $\lambda$. Thus, each time we run the one clean qubit computer, $\lambda$ has some random fixed value.
Figure 8: As an example we use $r = 3$, $k = 2$. We display the corresponding Young diagrams for each allowed profile $(c_2, c_3)$, where $c_2$ is the "overhang" of the top row over the second row, and $c_3$ is the overhang of the second row over the bottom row. As we add boxes, the lengths of these overhangs change. Thus, each Young tableau in $T_{n,2,3}$ uniquely corresponds to an $n$-step walk on the six allowed profiles.



cutoffs we need to compute, which we can see as follows. $c_2, c_3, \ldots, c_r$ each take values in $\{0, 1, \ldots, k - r\}$, thus $(k - r + 1)^{r-1}$ provides a loose upper bound on the number of allowed values of $\vec{c}(t)$. $t$ runs from 1 to $n$ and $j$ runs from 1 to $r$. Thus the total number of cutoffs we need to compute is upper bounded by $nr(k - r + 1)^{r-1}$, which is exponential in $r$, but for any fixed $r$ is polynomial in $n$. Thus we can classically precompute all of the necessary cutoffs and store them in a classical lookup table.

We will classically compute, for each of the allowed profiles $\vec{c}(t)$, and each $j$ and $t$, the corresponding cutoff

$$\lceil F_j(\vec{c}(t), \lambda, t) 2^\beta \rceil. \qquad (22)$$

To do this, we imagine a directed graph with vertices corresponding to the allowed profiles. An edge leads from profile $a$ to profile $b$ if $b$ can be obtained from $a$ by adding one box. This is illustrated in figure [9]. If we take the adjacency matrix $A$ of this graph, and raise it to the power $s$, the matrix elements are equal to the number of ways of getting from one profile to another using $s$ steps. In this way, we can obtain the value of $R_k(\Lambda_t)$ needed in equation [19]. This is the number of ways to get from $\vec{c}(t)$ to $\vec{c}(n)$ (the profile of $\lambda$) using $n - t$ steps. Similarly, we can obtain $R_k(\Lambda_t^j)$, which is the number of ways of getting to $\vec{c}(n)$ by starting with the profile of $\Lambda_t^j$ and making $n - t - 1$ steps. Thus, after computing the relevant powers of $A$, we can then efficiently compute each $\lceil F_j(\vec{c}(t), \lambda, t) 2^\beta \rceil$ using equations [19] and [20].
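The counting step can be sketched as follows. This is an illustration only: the helper names (`allowed_profiles`, `add_box`, `adjacency`, `mat_pow`) are hypothetical, profiles are stored as tuples $(c_2, \ldots, c_r)$, and the parameters $r = 3$, $k = 5$ used in the example are chosen here because they give $k - r = 2$ and hence six allowed profiles. Matrix powers are computed by repeated multiplication with exact integers, which is the simplest option rather than the fastest.

```python
from itertools import product


def allowed_profiles(k, r):
    """All tuples (c_2, ..., c_r) of non-negative integers with sum at most k - r."""
    bound = k - r
    return [c for c in product(range(bound + 1), repeat=r - 1) if sum(c) <= bound]


def add_box(profile, j, k, r):
    """Profile after adding a box to row j (1-indexed), or None if not allowed.

    Adding to row j decrements c_j (unless j = 1) and increments c_{j+1}
    (unless j = r). Entry c_m is stored at index m - 2.
    """
    c = list(profile)
    if j >= 2:
        if c[j - 2] == 0:
            return None  # rows j-1 and j have equal length
        c[j - 2] -= 1
    if j <= r - 1:
        c[j - 1] += 1
    return tuple(c) if sum(c) <= k - r else None


def adjacency(k, r):
    """Adjacency matrix of the directed graph on allowed profiles."""
    profiles = allowed_profiles(k, r)
    index = {p: a for a, p in enumerate(profiles)}
    A = [[0] * len(profiles) for _ in profiles]
    for p in profiles:
        for j in range(1, r + 1):
            q = add_box(p, j, k, r)
            if q is not None:
                A[index[p]][index[q]] += 1
    return profiles, index, A


def mat_mul(X, Y):
    size = len(X)
    return [[sum(X[a][m] * Y[m][b] for m in range(size)) for b in range(size)]
            for a in range(size)]


def mat_pow(A, s):
    """A raised to the power s; entry [a][b] counts s-step walks from a to b."""
    size = len(A)
    result = [[int(a == b) for b in range(size)] for a in range(size)]
    for _ in range(s):
        result = mat_mul(result, A)
    return result


if __name__ == "__main__":
    profiles, index, A = adjacency(k=5, r=3)
    walks = mat_pow(A, 3)
    # Number of restricted tableaux of shape (2, 1): walks from (0, 0) to (1, 1).
    print(len(profiles), walks[index[(0, 0)]][index[(1, 1)]])
```

In this example the three-step walks from the empty profile to the profile $(1, 1)$ correspond to the two standard Young tableaux of shape (2, 1).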

Given our table of cutoffs, the procedure for computing $\vec{c}(t)$ is a simple iteration. Suppose we know $\vec{c}(t-1)$. To obtain $\vec{c}(t)$ we compare the $t$th register of $\beta$ qubits to the relevant cutoffs

$$\lceil F_1(\vec{c}(t-1), \lambda, t-1) 2^\beta \rceil, \ldots, \lceil F_r(\vec{c}(t-1), \lambda, t-1) 2^\beta \rceil$$

to determine which row the $t$th box is added to. If the $t$th box is added to row $j$, then we decrement $c_j$ (unless $j = 1$) and increment $c_{j+1}$ (unless $j = r$).

The $i$th and $(i+1)$th registers together with the ancilla qubits containing $\vec{c}(i-1)$ encode the locations of boxes $i$ and $i + 1$. Thus, we can perform the transformation $U_{JW}(\sigma_i)$, as specified by equations [14] and [21], using a quantum circuit that acts only on these qubits. More specifically, this quantum circuit performs a
Figure 9: Continuing the example in figure [8] we choose $r = 3$, $k = 2$. We display a representative Young diagram for each allowed profile. The arrows represent the allowed transitions between these profiles by adding one box. $A$ is the adjacency matrix of this directed graph. The arrows to the right represent the addition of a box to the top row, the arrows diagonally downward represent the addition of a box to the middle row, and the arrows diagonally upward represent the addition of a box to the bottom row. After adding a box to the bottom row we omit the leftmost complete column, as per the notation of figure [8].



unitary transformation on the $i$th and $(i+1)$th registers that depends on the content of the ancilla qubits. The ancilla qubits themselves are not transformed.

The unitary transformation performed on the $i$th and $(i+1)$th registers is one which rotates between the encodings of a pair of standard Young tableaux which differ by having boxes $i$ and $i + 1$ swapped. $\nu_\lambda$ is not injective, but the numbers of bitstrings which encode these two tableaux are approximately equal. We illustrate this with an example. Suppose $\lambda$ = [Young diagram]. Consider the following pair of standard Young tableaux.
The total number of standard Young tableaux of shape $\lambda$ whose first five boxes appear in the configuration shown at left is the same as the number of standard Young tableaux of shape $\lambda$ whose first five boxes appear in the configuration shown at right. This is because this number depends only on the shape of the dashed region. Returning to the general case, we see that swapping a pair of labelled boxes can never change the shape of the dashed region. By the definition of $\nu_\lambda$, the fraction of the $2^{2\beta}$ possible bitstrings for registers $i$ and $i + 1$ that encode a given configuration of boxes $i$ and $i + 1$ is proportional to the fraction of Young tableaux of shape $\lambda$ in which the boxes are in that configuration. Hence, the number of bit assignments for registers $i$ and $i + 1$ that encode a given configuration is equal to the number that encode the configuration in which boxes $i$ and $i + 1$ are swapped, up to rounding. Thus we can always make some canonical matching between the bitstrings encoding the two configurations. The encoded version of transformation [14] is then to unitarily rotate between the current bitstring and its canonical matching.

In the case of Jones polynomials, we specified the canonical matching in equation [12]. Here, due to greater complexity, we do not specify any formula for the matching. Instead, while computing all the cutoffs, one can at the same time make arbitrary choices for the corresponding matchings and write them down. A complete lookup table of these choices can be stored using polynomially many bits because $2\beta = O(\log n)$. Given the choices of matchings, one can then use equation [14] to calculate all the matrix elements of $U_{JW}(\sigma_i)$. This matrix has polynomial dimension since it acts only on the two registers of $\beta = O(\log n)$ qubits each plus the $O(\log n)$ ancillas encoding $\vec{c}(i-1)$. It can therefore be implemented by an efficient quantum circuit using the method of [15]. After performing the unitary transformation, $\vec{c}(i-1)$ can be uncomputed.

As mentioned above, because of rounding, the numbers of bitstrings encoding the swapped and unswapped pair of boxes are not precisely equal, only approximately equal. Thus our canonical matching will in general have a small number of unpaired bitstrings encoding one of the two tableaux. As we did for Jones polynomials, we define the unitary transformation to act as the identity on these excess bitstrings outside of the matching. By an analysis essentially identical to that in section [4], one can see that choosing $\beta$ logarithmic in the number
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of crossings suffices to ensure that these unmatched bitstrings form a small enough fraction so that their 
effect on the trace of the circuit is negligible. 

By the above procedure we can construct an efficient quantum circuit for $U_{JW}(\sigma_i)$ satisfying equation [21] for any $i$ and any $\lambda$. By concatenating these, we can thus obtain a quantum circuit for $U_{JW}(b)$ for any $b \in B_n$ of polynomially many crossings. If $L$ is the link obtained by taking the trace closure of $b$ with each strand oriented downward then the corresponding HOMFLY polynomial is given by the Markov trace

$$H_L^{(r)}(e^{i2\pi/k}) = \left( \frac{\sin(\pi r/k)}{\sin(\pi/k)} \right)^{n-1} e^{-i\pi(r+1)e(b)/k} \sum_{\lambda} S^{(\lambda)}_{k,r} \, \mathrm{Tr}(\pi^{(\lambda)}_{n,k,r}(b)).$$

For any $\lambda$ we can estimate the normalized trace of $U_{JW}(b)$ to polynomial precision using the standard one clean qubit algorithm for trace estimation. Thus, we can estimate the HOMFLY polynomial by classically sampling from the possible Young diagrams $\lambda$ according to the distribution

$$p(\lambda) = S^{(\lambda)}_{k,r} |T^{(\lambda)}_{n,k,r}|$$

and estimating the corresponding normalized trace

$$\frac{1}{|T^{(\lambda)}_{n,k,r}|} \mathrm{Tr}(U_{JW}(b))$$

for each $\lambda$ sampled.

To do this we need to compute the values of $S^{(\lambda)}_{k,r}$ and $|T^{(\lambda)}_{n,k,r}|$ for each allowed $\lambda$. It is not hard to see that the allowed $n$-box Young diagrams are in bijective correspondence with the allowed profiles. Thus for fixed $k$ and $r$, the number of values of $S^{(\lambda)}_{k,r}$ we need to compute is independent of $n$. It is clear by equation [17] that each $S^{(\lambda)}_{k,r}$ can be classically computed in polynomial time. Similarly, for fixed $k$ and $r$, there are only poly($n$) different values of $|T^{(\lambda)}_{n,k,r}|$ to compute. $|T^{(\lambda)}_{n,k,r}| = R_k(\emptyset)$, where $\emptyset$ denotes the empty tableau, thus each $|T^{(\lambda)}_{n,k,r}|$ can be computed in polynomial time using the algorithm for computing $R_k$ described earlier.

7 Conclusion 

In this paper we have shown that one clean qubit computers can in polynomial time obtain additive approximations to the Jones and HOMFLY polynomials of the trace closure of braids at arbitrary roots of unity. This generalizes the result of [18] which showed that one clean qubit computers can efficiently approximate the Jones polynomial of the trace closure of braids at the fifth root of unity. In [18] it was also shown that this problem is DQC1-complete. The completeness proof is based on the fact that the image of the path model representation $\rho_{n,5} : B_n \to U(V_{n,5})$ modulo global phase is dense in $SU(V_{n,5})$. By the results of [7], this density result holds also for all $k$ other than 1, 2, 3, 4, and 6, and similar density results hold for the Jones-Wenzl representation. Thus it is natural to conjecture that DQC1-completeness extends to Jones polynomials beyond $k = 5$ and to HOMFLY polynomials. DQC1-completeness would imply that the additive approximations achieved by the algorithms here cannot be achieved in polynomial time by classical computers unless DQC1 $\subseteq$ P. Such completeness questions provide a promising direction for further research.
Another direction is to generalize the algorithm even further. For evaluating the Jones polynomial when $t$ is not a root of unity, the relevant representation of the braid group is nonunitary. In [2], Aharonov et al. give a general quantum algorithm to approximate Jones polynomials at all values of $t$ and to evaluate Tutte polynomials. They achieve this by interacting the computational qubits with an "environment" of ancilla qubits, thereby inducing nonunitary dynamics on the computational qubits. It would be interesting to see whether similar techniques can be carried over to the one clean qubit model.
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Figure 10: For the special case of two rows, the Young tableaux of n boxes become equivalent to paths of n steps. 
Adding a box to the top row corresponds to a step up, and adding a box to the bottom row corresponds to a step 
down. 



8 Acknowledgements 

We thank Peter Shor for useful discussions. During the research and writing of this paper SJ was at the Center
for Theoretical Physics at MIT, the Digital Materials Laboratory at RIKEN, and the Institute for Quantum
Information at Caltech. SJ thanks these institutions as well as the Army Research Office (ARO), the 
Disruptive Technology Office (DTO), the Department of Energy (DOE), Franco Nori and Sahel Ashab at 
RIKEN, and John Preskill at Caltech. PW gratefully acknowledges support from NSF grants CCF-0726771 
and CCF-0746600. PW would like to thank Eddie Farhi's group for their hospitality and the W. M. Keck 
Foundation for partial support. 

A Jones Polynomials from HOMFLY polynomials 

As shown in figure [6], a Young tableau corresponds to a process by which a Young diagram is built up by adding one box at a time. If $r = 2$ then the Young diagram has two rows (although at some steps the second row may be empty). This process can therefore be completely described by listing the difference between the number of boxes in the first and second rows at each step. The values of this difference correspond to the rungs of the ladder in the path model, as illustrated in figure [10]. The values appearing in the path model representation, as defined in section [2], can be rewritten as follows.



n, 



C-l 



. _,>2i+i smfTr/k) 
ze 2 * 



sin(jrl/k) 



h = di 
ej = Si 



te 



in 21. , / , I sin(7r/fc) 



sin(7r//fc) 



r/2fc 



Thus, comparing the path model representation to equation [14] shows that

$$\pi_{n,k,2}(\sigma_i) = i e^{i3\pi/2k} \rho_{n,k}(\sigma_i). \qquad (23)$$



As shown in section 5 of [20], the weights in the Markov trace for the Jones-Wenzl representation simplify substantially in the case $r = 2$. Specifically, the weights $S^{(\lambda)}_{k,r}$ given in equation [17] simplify to

$$S^{(\lambda)}_{k,2} = \frac{\sin(\pi l(\lambda)/k)}{\sin(\pi/k)(2\cos(\pi/k))^{n}}$$

where $l(\lambda)$ is the number of boxes in the top row of $\lambda$ minus the number of boxes in the bottom row of $\lambda$ plus 1. By the correspondence of figure [10], $l$ is the final rung of the corresponding path. The Markov trace
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Figure 11: Shown are the two Markov moves. Here the boxes A and B represent arbitrary braids. Note that Markov 
move II increases the number of strands by one. 



of the Jones-Wenzl representation of the identity braid is 1. Thus

$$\sum_{\lambda} S^{(\lambda)}_{k,2} |T^{(\lambda)}_{n,k,2}| = 1$$

and so

$$S^{(\lambda)}_{k,2} = \frac{\sin(\pi l(\lambda)/k)}{\sum_{\lambda'} |T^{(\lambda')}_{n,k,2}| \sin(\pi l(\lambda')/k)},$$

where the sum over $\lambda'$ is over all Young diagrams of $n$ boxes and 2 rows such that $l(\lambda') < k$. Comparison with equation [3] shows that the weighted traces appearing in the Jones and HOMFLY polynomials are weighted identically in the case $r = 2$. This fact and equation [23] show that for any braid $b \in B_n$,

$$\widetilde{\mathrm{Tr}}(\pi_{n,k,2}(b)) = (i e^{i3\pi/2k})^{e(b)} \, \widetilde{\mathrm{Tr}}(\rho_{n,k}(b)), \qquad (24)$$

where $e(b)$ is the sum of the exponents appearing in $b$ when written in terms of the generators $\sigma_1, \ldots, \sigma_{n-1}$. Substituting equation [24] into equation [13] and simplifying yields

$$H_L^{(2)}(e^{i2\pi/k}) = i^{e(b)} (2\cos(\pi/k))^{n-1} e^{-i3e(b)\pi/2k} \, \widetilde{\mathrm{Tr}}(\rho_{n,k}(b))$$

where $L$ is the directed link obtained by taking the trace closure of the braid $b$. $e(b)$ is minus the writhe of $L$. Thus, comparison with equation [2] shows

$$
\begin{aligned}
H_L^{(2)}(e^{i2\pi/k}) &= (-i)^{-2w(L)} (-1)^{n-1} V_L(e^{i2\pi/k}) \\
&= (-1)^{w(L)+n-1} V_L(e^{i2\pi/k}).
\end{aligned}
$$

The sign discrepancy $(-1)^{w(L)+n-1}$ is itself a link invariant, and is therefore inconsequential. To show this we use Markov's theorem, which states that the oriented link obtained by taking the trace closure of braid $b_1$ is equivalent to the oriented link obtained by taking the trace closure of braid $b_2$ if and only if $b_1$ can be transformed into $b_2$ by some finite sequence of the two Markov moves shown in figure [11] (and their inverses). It is easy to see that the factor $(-1)^{w(L)+n-1}$ is invariant under both Markov moves for all braids and is therefore an invariant of the corresponding trace closures.
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